Towards a Semantic Annotation of English Television News - Building and Evaluating a Constraint Grammar FrameNet
Abstract
This paper introduces a new approach to the Chinese Pinyin-to-character (PTC) conversion problem. Converting Chinese Pinyin to Chinese characters can be regarded as a transformation between two different languages (from the Latin writing system of Pinyin to the character form of Chinese, Hanzi), which fits naturally into a machine translation framework. The PTC problem is usually treated as a sequence labeling problem; however, it is more difficult than general sequence labeling problems because the label set must cover all Chinese characters. The essential difficulty of the task lies in the high degree of ambiguity among the Chinese characters that correspond to a given Pinyin. Our approach is novel in that it effectively combines features of the continuous source sequence and target sequence. The experimental results show that the proposed approach is much faster and also outperforms existing sequence labeling approaches.
Similar resources
Towards a Semantic Annotation of English Television News - Building and Evaluating a Constraint Grammar FrameNet
This paper presents work on the semantic annotation of a multimodal corpus of English television news. The annotation is performed on the second-by-second aligned transcript layer, adding verb frame categories and semantic roles on top of a morphosyntactic analysis with full dependency information. We use a rule-based method, where Constraint Grammar mapping rules are automatically generated from...
The FrameNet Constructicon
The Berkeley FrameNet Project has been engaged since 1997 in discovering and describing the semantic and distributional properties of words in the general vocabulary of English. Notions from FRAME SEMANTICS (see Fillmore and Baker 2009 and references therein) provide the basis of the semantic description of the lexical units in the database, and sentences extracted from the FrameNet (FN) text...
Graph Methods for Multilingual FrameNets
This paper introduces a new, graph-based view of the data of the FrameNet project, which we hope will make it easier to understand the mixture of semantic and syntactic information contained in FrameNet annotation. We show how English FrameNet and other Frame Semantic resources can be represented as sets of interconnected graphs of frames, frame elements, semantic types, and annotated instances ...
The Impact of Grammar Enhancement on Semantic Resources Induction
This paper describes the effects of the evolution of an Italian dependency grammar on a task of multilingual FrameNet acquisition. The task is based on the creation of virtual English/Italian parallel annotation corpora, which are then aligned at the dependency level using two manually encoded grammar-based dependency parsers. We show how the evolution of the LAS (Labeled Attachment Score) me...
Frame Information Transfer from English to Italian
We describe an automatic projection algorithm for transferring frame-semantic information from English to Italian texts, as a first step towards the creation of an Italian FrameNet. Projection of frame-semantic information from English to other European languages has already been investigated for German, Swedish and French. With our work, we point out typical features of the Italian language as reg...